Fix: Prevent crashes when HuggingFace is unreachable by mpecanha · Pull Request #152 · jamiepine/voicebox

mpecanha · 2026-02-22T09:57:55Z

Voicebox Offline Mode Fix

Problem

Voicebox crashes when generating speech if HuggingFace is unreachable, even when models are fully cached locally.

Root Cause:

Voicebox downloads mlx-community/Qwen3-TTS-12Hz-1.7B-Base-bf16 (the MLX-optimized version).
However, mlx_audio.tts.load() tries to fetch config.json from the original repo, Qwen/Qwen3-TTS-12Hz-1.7B-Base.
That network request fails, and the server crashes with RemoteDisconnected.

Related Issues:

Issue #150: "Internet connection required, even though models are downloaded?"
Issue #151: "API Stability Issues: Model Loading Hangs and Server Crashes"

Solution

Two-part fix:

1. Monkey-patch huggingface_hub (`backend/utils/hf_offline_patch.py`)

Intercepts cache lookup functions
Forces offline mode early (before mlx_audio imports)
Adds debug logging for cache hits/misses
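
The contents of hf_offline_patch.py are not shown on this page, so the following is only a minimal sketch of two of the ideas listed above: forcing offline mode before the import, and wrapping a lookup function with debug logging. `HF_HUB_OFFLINE` is the environment variable huggingface_hub uses for offline mode, and `VOICEBOX_OFFLINE_PATCH` comes from the Notes section; `force_offline` and `log_cache_lookups` are hypothetical names chosen for illustration.

```python
import functools
import logging
import os

logger = logging.getLogger("hf_offline_patch")


def force_offline(environ=os.environ):
    """Set HF_HUB_OFFLINE=1 unless VOICEBOX_OFFLINE_PATCH=0 disables the patch.

    Intended to run before huggingface_hub or mlx_audio is imported, on the
    assumption that the library reads the variable when it is first imported.
    Returns True if offline mode was requested, False if the patch is disabled.
    """
    if environ.get("VOICEBOX_OFFLINE_PATCH", "1") == "0":
        return False
    environ["HF_HUB_OFFLINE"] = "1"
    return True


def log_cache_lookups(func):
    """Wrap a cache lookup function so each call is logged as a hit or a miss.

    A falsy return value (for example None) is treated as a miss.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        outcome = "hit" if result else "miss"
        logger.debug("%s%r -> %s", func.__name__, args, outcome)
        return result
    return wrapper
```

In an actual patch module, the wrapper would be assigned over the relevant huggingface_hub lookup functions; which functions this PR wraps is not stated here.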

2. Symlink original repo to MLX version (`ensure_original_qwen_config_cached()`)

When the cache for the original Qwen/Qwen3-TTS-12Hz-1.7B-Base repo does not exist
but the cache for mlx-community/Qwen3-TTS-12Hz-1.7B-Base-bf16 does,
the patch creates a symlink so cache lookups for the original repo succeed.
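
The symlink step can be sketched roughly as follows. This is an illustration, not the body of `ensure_original_qwen_config_cached()`: it assumes the standard hub cache layout, where a repo `org/name` is stored in a folder named `models--org--name`, and it links the whole repo folder, whereas the real function may link only config.json. The helper names `repo_cache_dir` and `link_original_to_mlx` are hypothetical.

```python
from pathlib import Path


def repo_cache_dir(cache_root: Path, repo_id: str) -> Path:
    """Map a repo id such as 'org/name' to its cache folder 'models--org--name'."""
    return cache_root / ("models--" + repo_id.replace("/", "--"))


def link_original_to_mlx(cache_root: Path, original_repo: str, mlx_repo: str) -> bool:
    """Symlink the original repo's cache folder to the MLX one.

    Only acts when the original folder is missing and the MLX folder exists.
    Returns True when a symlink was created, False otherwise.
    """
    original_dir = repo_cache_dir(cache_root, original_repo)
    mlx_dir = repo_cache_dir(cache_root, mlx_repo)
    if original_dir.exists() or original_dir.is_symlink():
        return False  # original already cached, or a link is already in place
    if not mlx_dir.is_dir():
        return False  # MLX version is not cached, so there is nothing to link to
    original_dir.symlink_to(mlx_dir, target_is_directory=True)
    return True
```

Called with the hub cache root (by default `~/.cache/huggingface/hub`), `"Qwen/Qwen3-TTS-12Hz-1.7B-Base"`, and `"mlx-community/Qwen3-TTS-12Hz-1.7B-Base-bf16"`, this makes lookups against the original repo id resolve to the MLX files.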

Files Changed

backend/backends/mlx_backend.py - Added patch imports at top
backend/utils/hf_offline_patch.py - New patch module

Testing

To test this fix:

Build Voicebox from source: make build
Disconnect from internet
Try generating speech
Speech generation should succeed without making any network requests
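
For an automated check, the "disconnect from internet" step can be approximated inside a Python process by making socket connections fail. The sketch below is an assumption about how one might test this, not part of the PR; `block_network` is a hypothetical helper, and it only affects code in the same process that goes through `socket.socket.connect`, so it does not replace the manual test above.

```python
import contextlib
import socket
from unittest import mock


@contextlib.contextmanager
def block_network():
    """Make socket.socket.connect raise OSError while the context is active."""

    def refuse(self, address):
        raise OSError(f"network blocked during test: {address!r}")

    # patch.object restores the original connect method on exit.
    with mock.patch.object(socket.socket, "connect", refuse):
        yield
```

Wrapping a speech-generation call in `with block_network():` would then surface any remaining network access as an `OSError`.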

Build Instructions

# Install dependencies
pip install -r requirements.txt

# Build the app
make build

# Or build just the server
make build-server

Notes

The patch is applied automatically when mlx_backend.py is imported
Set VOICEBOX_OFFLINE_PATCH=0 to disable the patch
The symlink approach works because the config.json is compatible between versions

Patch contributed by community

Implements offline mode patch for API stability issues:
- Add hf_offline_patch.py to monkey-patch huggingface_hub
- Force cache-only lookups before mlx_audio imports
- Create symlink from original Qwen repo to MLX community version when only MLX version is cached

This fixes:
- Issue jamiepine#150: Internet required even with cached models
- Issue jamiepine#151: API crashes when HF network fails

The patch ensures that if models are locally cached, no network requests are made to HuggingFace during speech generation.

mpecanha wants to merge 1 commit into jamiepine:main from mpecanha:fix-offline-mode-crash